Matrix concentration inequalities with dependent summands and sharp leading-order terms
We establish sharp concentration inequalities for sums of dependent random
matrices. Our results concern two models. First, a model where summands are
generated by a ψ-mixing Markov chain. Second, a model where summands are
expressed as deterministic matrices multiplied by scalar random variables. In
both models, the leading-order term is provided by free probability theory.
This leading-order term is often asymptotically sharp and, in particular, does
not suffer from the logarithmic dimensional dependence which is present in
previous results such as the matrix Khintchine inequality.
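For orientation, the second model and the classical bound it is compared against can be written out as follows. This is only a sketch in notation that does not appear in the abstract: the symbols $X$, $A_i$, $\xi_i$, $d$, $\sigma$ and the unspecified constant $C$ are assumptions, and the inequality is the standard matrix Khintchine bound for independent Gaussian or Rademacher coefficients, not the paper's result.

```latex
% Second model: deterministic self-adjoint d x d matrices A_i
% multiplied by scalar random variables xi_i (notation assumed).
\[
  X = \sum_{i=1}^{n} \xi_i A_i .
\]
% Classical matrix Khintchine bound when the xi_i are independent
% standard Gaussian or Rademacher variables; C is an absolute constant.
\[
  \mathbb{E}\,\lVert X \rVert \;\le\; C \sqrt{\log d}\;\sigma,
  \qquad
  \sigma = \Bigl\lVert \sum_{i=1}^{n} A_i^{2} \Bigr\rVert^{1/2} .
\]
```

The factor $\sqrt{\log d}$ is the logarithmic dimensional dependence referred to above.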
A key challenge in the proof is that techniques based on classical cumulants,
which can be used in a setting with independent summands, fail to produce
efficient estimates in the Markovian model. Our approach is instead based on
Boolean cumulants and a change-of-measure argument.
We discuss applications concerning community detection in Markov chains,
random matrices with heavy-tailed entries, and the analysis of random graphs
with dependent edges.
Comment: 69 pages, 4 figures
Detection and Evaluation of Clusters within Sequential Data
Motivated by theoretical advancements in dimensionality reduction techniques,
we use a recent model, called Block Markov Chains, to conduct a practical study
of clustering in real-world sequential data. Clustering algorithms for Block
Markov Chains possess theoretical optimality guarantees and can be deployed in
sparse data regimes. Despite these favorable theoretical properties, a thorough
evaluation of these algorithms in realistic settings has been lacking.
We address this issue and investigate the suitability of these clustering
algorithms in exploratory data analysis of real-world sequential data. In
particular, our sequential data is derived from human DNA, written text, animal
movement data and financial markets. In order to evaluate the determined
clusters, and the associated Block Markov Chain model, we further develop a set
of evaluation tools. These tools include benchmarking, spectral noise analysis
and statistical model selection tools. An efficient implementation of the
clustering algorithm and the new evaluation tools is made available together
with this paper.
Practical challenges associated with real-world data are encountered and
discussed. It is ultimately found that the Block Markov Chain model assumption,
together with the tools developed here, can indeed produce meaningful insights
in exploratory data analyses despite the complexity and sparsity of real-world
data.
Comment: 37 pages, 12 figures
Singular value distribution of dense random matrices with block Markovian dependence
A block Markov chain is a Markov chain whose state space can be partitioned
into a finite number of clusters such that the transition probabilities only
depend on the clusters. Block Markov chains thus serve as a model for Markov
chains with communities. This paper establishes limiting laws for the singular
value distributions of the empirical transition matrix and empirical frequency
matrix associated with a sample path of the block Markov chain whenever the
length of the sample path is Θ(n²), with n the size of the state
space.
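The objects named in this paragraph can be illustrated with a small simulation. The sketch below is not the paper's code. It assumes one common parametrization of a block Markov chain: from a state in cluster a, the next cluster b is drawn with probability P[a, b], and the next state is then uniform within cluster b. The function names, the two-cluster matrix P, the state-space size n = 60, and the path length n * n are all illustrative assumptions.

```python
import numpy as np


def sample_path(cluster_of, P, length, rng):
    """Sample a block Markov chain path with `length` transitions.

    Assumed parametrization: pick the next cluster from row
    P[current cluster], then a uniform state inside that cluster.
    """
    cluster_of = np.asarray(cluster_of)
    k = P.shape[0]
    members = [np.flatnonzero(cluster_of == c) for c in range(k)]
    path = np.empty(length + 1, dtype=int)
    path[0] = rng.integers(len(cluster_of))  # arbitrary uniform start
    for t in range(length):
        next_cluster = rng.choice(k, p=P[cluster_of[path[t]]])
        path[t + 1] = rng.choice(members[next_cluster])
    return path


def empirical_frequency_matrix(path, n):
    """Entry (x, y) counts the observed transitions x -> y."""
    N = np.zeros((n, n))
    np.add.at(N, (path[:-1], path[1:]), 1)
    return N


def empirical_transition_matrix(N):
    """Row-normalize the counts; rows never visited stay zero."""
    row_sums = N.sum(axis=1, keepdims=True)
    return np.divide(N, row_sums, out=np.zeros_like(N), where=row_sums > 0)


rng = np.random.default_rng(0)
n = 60                                   # size of the state space (assumed)
cluster_of = np.repeat([0, 1], n // 2)   # two equal clusters (assumed)
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])               # cluster transition matrix (assumed)

path = sample_path(cluster_of, P, length=n * n, rng=rng)
N = empirical_frequency_matrix(path, n)
T = empirical_transition_matrix(N)

# Singular values of both matrices, sorted in non-increasing order.
sv_frequency = np.linalg.svd(N, compute_uv=False)
sv_transition = np.linalg.svd(T, compute_uv=False)
```

A histogram of `sv_frequency` or `sv_transition` for larger n would give an empirical view of the singular value distributions the abstract refers to.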
The proof approach is split into two parts. First, we introduce a class of
symmetric random matrices with dependence called approximately uncorrelated
random matrices with variance profile. We establish their limiting eigenvalue
distributions by means of the moment method. Second, we develop a coupling
argument to show that this general-purpose result applies to block Markov
chains.
Comment: 51 pages, 10 figures